Random Walk Factoid Annotation for Collective Discourse

Authors

  • Ben King
  • Rahul Jha
  • Dragomir R. Radev
  • Robert Mankoff
Abstract

In this paper, we study the problem of automatically annotating the factoids present in collective discourse. Factoids are information units that are shared between instances of collective discourse and may have many different ways of being realized in words. Our approach divides this problem into two steps, using a graph-based approach for each step: (1) factoid discovery, finding groups of words that correspond to the same factoid, and (2) factoid assignment, using these groups of words to mark collective discourse units that contain the respective factoids. We study this on two novel data sets: the New Yorker caption contest data set, and the crossword clues data set.
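
The two-step structure described in the abstract can be illustrated with a toy sketch. This is not the authors' method, which the abstract does not specify. It assumes a word co-occurrence graph, a random walk with restart to score word proximity, and a mutual-visit threshold to form word groups. All function names, the `restart` and `threshold` values, and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(docs):
    """Undirected graph; edge weight = number of docs containing both words."""
    graph = defaultdict(lambda: defaultdict(float))
    for doc in docs:
        for u, v in combinations(sorted(set(doc)), 2):
            graph[u][v] += 1.0
            graph[v][u] += 1.0
    return graph


def walk_with_restart(graph, start, restart=0.3, steps=50):
    """Approximate visiting probabilities of a walk that restarts at `start`."""
    prob = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        nxt[start] += restart  # restart mass (total probability is 1)
        for node, p in prob.items():
            total = sum(graph[node].values())
            if total == 0:  # isolated node: send its mass back to the start
                nxt[start] += (1 - restart) * p
                continue
            for neighbor, weight in graph[node].items():
                nxt[neighbor] += (1 - restart) * p * weight / total
        prob = dict(nxt)
    return prob


def discover_factoids(docs, threshold=0.2):
    """Step 1 (assumed form): group words that visit each other often."""
    graph = cooccurrence_graph(docs)
    words = sorted(graph)
    visit = {w: walk_with_restart(graph, w) for w in words}
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            w = parent[w]
        return w

    for u, v in combinations(words, 2):
        if visit[u].get(v, 0.0) >= threshold and visit[v].get(u, 0.0) >= threshold:
            parent[find(u)] = find(v)

    groups = defaultdict(set)
    for w in words:
        groups[find(w)].add(w)
    # Keep only groups with at least two words, in a stable order.
    return sorted((g for g in groups.values() if len(g) > 1), key=min)


def assign_factoids(docs, factoids):
    """Step 2 (assumed form): mark each doc with the factoids whose words it uses."""
    return [[i for i, group in enumerate(factoids) if group & set(doc)]
            for doc in docs]


if __name__ == "__main__":
    # Hypothetical tokenized discourse units, for illustration only.
    docs = [["dog", "bark"], ["dog", "bark"],
            ["cat", "meow"], ["cat", "meow"], ["dog", "cat"]]
    factoids = discover_factoids(docs)
    print(factoids)
    print(assign_factoids(docs, factoids))
```

The restart keeps each walk close to its start word, so only strongly connected words end up in the same group. A realistic system would need a graph that also links different wordings of the same idea, which plain co-occurrence does not capture.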

Similar resources

Discourse Complements Lexical Semantics for Non-factoid Answer Reranking

We propose a robust answer reranking model for non-factoid questions that integrates lexical semantics with discourse information, driven by two representations of discourse: a shallow representation centered around discourse markers, and a deep one based on Rhetorical Structure Theory. We evaluate the proposed model on two corpora from different genres and domains: one from Yahoo! Answers and ...

Agreement in Human Factoid Annotation for Summarization Evaluation

Factoid analysis was introduced by van Halteren and Teufel (2003) as an objective, yet semantics-oriented way of measuring overlap of information rather than surface strings in summaries. In this paper, we report on annotation experiments with two sets of summaries, and on a factoid-pairing program which finds correlations between factoids semi-automatically.

Evaluating Information Content by Factoid Analysis: Human annotation and stability

We present a new approach to intrinsic summary evaluation, based on initial experiments in van Halteren and Teufel (2003), which combines two novel aspects: comparison of information content (rather than string similarity) in gold standard and system summary, measured in shared atomic information units which we call factoids, and comparison to more than one gold standard summary (in our data: 2...

Refining Image Annotation by Integrating PLSA with Random Walk Model

In this paper, we present a new method for refining image annotation by integrating probabilistic latent semantic analysis (PLSA) with a random walk (RW) model. First, we construct a PLSA model with asymmetric modalities to estimate the posterior probability of each annotating keyword for an image, and then a label similarity graph is constructed by a weighted linear combination of label simil...

Collective Media Annotation using Random Field Models

We present methods for semantic annotation of multimedia data. The goal is to detect semantic attributes (also referred to as concepts) in clips of video via analysis of a single keyframe or set of frames. The proposed methods integrate high performance discriminative single concept detectors in a random field model for collective multiple concept detection. Furthermore, we describe a generic f...

Publication date: 2013